Semi-supervised representation learning method combining graph auto-encoder and clustering
Hangyuan DU, Sicong HAO, Wenjian WANG
Journal of Computer Applications    2022, 42 (9): 2643-2651.   DOI: 10.11772/j.issn.1001-9081.2021071354

Node labels are a widely available form of supervision information in complex networks and play an important role in network representation learning. Based on this observation, a Semi-supervised Representation Learning method combining Graph Auto-Encoder and Clustering (GAECSRL) was proposed. First, a Graph Convolutional Network (GCN) and an inner product function were used as the encoder and the decoder respectively, forming a graph auto-encoder that serves as an information propagation framework. Then, a k-means clustering module was applied to the low-dimensional representation generated by the encoder, so that the training of the graph auto-encoder and the category assignment of the nodes together form a self-supervised mechanism. Finally, the discriminative information in the node labels was used to guide the category assignment of the low-dimensional representation. Network representation generation, category assignment, and graph auto-encoder training were integrated into a unified optimization model, yielding an effective network representation that incorporates node label information. In the simulation experiments, GAECSRL was applied to node classification and link prediction tasks.
Experimental results show that, compared with DeepWalk, node2vec, learning Graph Representations with global structural information (GraRep), Structural Deep Network Embedding (SDNE) and Planetoid (Predicting labels and neighbors with embeddings transductively or inductively from data), GAECSRL improves the Micro-F1 index by 0.9 to 24.46 percentage points and the Macro-F1 index by 0.76 to 24.20 percentage points in the node classification task; in the link prediction task, GAECSRL improves the AUC (Area Under Curve) index by 0.33 to 9.06 percentage points. These results indicate that the network representations obtained by GAECSRL effectively improve performance on node classification and link prediction tasks.
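
The encoder and decoder pairing named in the abstract (a GCN encoder with an inner product decoder) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single linear GCN layer with symmetric normalization, and the toy graph, feature matrix, and weight values are hypothetical. The k-means module, the label-guided term, and all training are omitted.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the symmetric normalization commonly used by GCNs."""
    n = len(adj)
    # Add self-loops so every node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_encode(adj, features, weights):
    """One linear GCN layer (an assumption of this sketch): Z = A_norm X W."""
    return matmul(matmul(normalize_adjacency(adj), features), weights)

def inner_product_decode(z):
    """Reconstruct edge probabilities as sigmoid(z_i . z_j) for every node pair."""
    n = len(z)
    return [
        [1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(z[i], z[j])))) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical 3-node path graph 0 - 1 - 2 with one-hot node features.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6]]  # illustrative values, not learned

z = gcn_encode(adj, features, weights)   # 3 x 2 low-dimensional representation
recon = inner_product_decode(z)          # 3 x 3 reconstructed adjacency probabilities
```

In a full method of this kind, `weights` would be learned by minimizing a reconstruction loss between `recon` and `adj`, and the rows of `z` would be the input to the clustering step.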

Label noise filtering method based on dynamic probability sampling
Zenghui ZHANG, Gaoxia JIANG, Wenjian WANG
Journal of Computer Applications    2021, 41 (12): 3485-3491.   DOI: 10.11772/j.issn.1001-9081.2021061026

In machine learning, data quality has a far-reaching impact on the accuracy of system predictions. Because information is difficult to obtain and human cognition is subjective and limited, experts cannot label all samples accurately. Moreover, some probability sampling methods proposed in recent years fail to avoid unreasonable and subjective manual division of samples. To solve this problem, a label noise filtering method based on Dynamic Probability Sampling (DPS) was proposed, which fully considers the differences between the samples of each dataset. A reasonable threshold was determined by counting the frequency of the built-in confidence distribution in each interval and analyzing the trend of the information entropy of that distribution across intervals. Fourteen datasets were selected from the classic UCI datasets, and the proposed algorithm was compared with the Random Forest (RF), High Agreement Random Forest Filter (HARF), Majority Vote Filter (MVF) and Local Probability Sampling (LPS) methods. Experimental results show that the proposed method performs well in both label noise recognition and classification generalization.
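
The two quantities the abstract mentions, the frequency of confidence values per interval and the information entropy of that distribution, can be sketched as follows. The abstract does not give the DPS threshold rule, so this shows only the building blocks. It assumes confidences lie in [0, 1] and are binned into equal-width intervals; the function names and the bin count are hypothetical.

```python
import math

def interval_frequencies(confidences, n_bins=10):
    """Relative frequency of confidence values in each of n_bins equal-width intervals on [0, 1]."""
    if not confidences:
        raise ValueError("confidences must be non-empty")
    counts = [0] * n_bins
    for c in confidences:
        # A confidence of exactly 1.0 is placed in the last interval.
        idx = min(int(c * n_bins), n_bins - 1)
        counts[idx] += 1
    total = len(confidences)
    return [k / total for k in counts]

def shannon_entropy(probs):
    """Shannon entropy in bits; empty intervals contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical confidence values for eight samples.
confidences = [0.1, 0.3, 0.6, 0.9, 0.92, 0.95, 0.97, 1.0]
freqs = interval_frequencies(confidences, n_bins=4)
entropy = shannon_entropy(freqs)
```

An entropy near zero means the confidences are concentrated in a few intervals, while an entropy near `log2(n_bins)` means they are spread evenly. How that trend is turned into a filtering threshold is specific to the paper and is not reproduced here.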
